Graph Search Algorithms



Graph Traversal Algorithms

Graph traversal refers to a process that traverses vertices of a graph following a certain order (starting from user-input sources). This category of graph search algorithms only seeks to find a path between two nodes, without optimizing for the length of the final route. In applications where the weight of edges in a graph are all equal (e.g. 1), BFS and DFS algorithms outperform shortest path algorithms like Dijkstra’s.

Breadth-first Search (BFS)

BFS is an algorithm where the traversal starts at a specified node (the source or starting node) and continues along the graph layerwise, thus exploring all exploring all of the the current node’s neighbouring nodes (those which are directly connected to the current node). If a result is not found, the algorithm proceeds to search the next-level neighbour nodes.

BREADTH-FIRST-SEARCH(source,destination) return a route
frontiera FIFO initialized with source node
exploredempty
foundFalse

while frontier is not empty and found is False do
nodefrontier.pop()
add node to explored
for child in node.expand() do
if child is not in explored and child is not in frontier then
if child is destination then
routechild.route()
foundTrue
add child to frontier
return route

Using BFS, search for the shortest path between The Equestrian Statue and the Bahen Centre. This example uses the same data as in Getting Started.

Note

This book uses the smart_mobility_utilities package for some operations, in order to simplify the process of visualizing graphs. You can find out more about downloading and installing the package here.

Let’s first find the largest connected component centered around our location, with a specified distance on each side. The reference point is the centre of the University of Toronto’s downtown campus.

import osmnx
reference = (43.661667, -79.395)
G = osmnx.graph_from_point(reference, dist=500, clean_periphery=True, simplify=True)

To plot the network, we will also need to highlight the starting and ending nodes. For the sake of simplicity, we will use the node id directly. For more information on finding the closest node to a given coordinate, refer back to the Getting Started section’s example.

highlighted = [1907446268, 1633421938]

# marking both the source and destination node

nc = ['red' if node in highlighted else '#336699' for node in G.nodes()]
ns = [50 if node in highlighted else 8 for node in G.nodes()]
fig, ax = osmnx.plot_graph(G, node_size=ns, node_color=nc, node_zorder=2)
../_images/GraphSearchAlgorithms_6_0.png

Let’s visualize the above graph on a ipyleaflet map, using a helper function from the smart_mobility_utilities package.

from smart_mobility_utilities.viz import draw_map
draw_map(G,highlight=highlighted, force_leaflet=True)

Warning

For the purposes of this map, we use the force_leaflet option so that the map will be rendered by ipyleaflet. Normally, when there are more than 1,000 nodes in a graph, ipyleaflet performance is very slow. The visualization tools in smart_mobility_utilities will automatically switch to folium when there are more than 1,000 nodes, unless the force_leaflet flag is used. See the docs for smart_mobility_utilities for more information.

Currently, each node in the above graph is represented as a python dict with many attributes that are of no interest to us. This makes accessing certain properties of nodes overly complicated and verbose. To minimize this, we can use the Node class from smart_mobility_utilities.common to redefine the nodes, and only retain key information like parent, edge length from parent, and the node itself. You can view the structure of Node here.

from smart_mobility_utilities.common import Node, cost
from tqdm.notebook import tqdm
from collections import deque
from time import process_time

# First convert the source and destination nodes to Node
origin = Node(graph=G, osmid=1907446268)
destination = Node(graph=G, osmid=1633421938)

# Setup a progress bar using tqdm, max is V+E
time_complexity = len(G.nodes)+len(G.edges)
bar = tqdm(total=time_complexity)

# Process run time tracking for later analysis
start_bfs = process_time()

route = []
frontier = deque([origin])
explored = set()
found = False


while frontier and not found:
    node = frontier.popleft()
    explored.add(node)
    for child in node.expand():
        if child not in explored and child not in frontier:
            if child == destination:
                route = child.path()
                found = True
            frontier.append(child)
        bar.update(1)

end_bfs = process_time() - start_bfs

bar.close()
print("Route: \n",route,"\n\n Cost:\n",cost(G,route))
Route: 
 [1907446268, 55808224, 55808416, 55808284, 1721866234, 389678268, 4953810915, 389678267, 24960090, 24960068, 1258698109, 389678145, 24960070, 24960073, 24960076, 24960080, 6028561924, 5098988924, 389678131, 6028562356, 854322047, 389677908, 24959560, 242413453, 749951161, 7311057931, 389678216, 389678215, 389678226, 1633421933, 1633421938] 

 Cost:
 1385.116

Note

We used collections.deque instead of a list because of the time-complexity required for pop and append. With pythonic list, these two operations take O(n) time, while collections.deque operates in O(1) time, as it is based on LinkedList.

Additionally, tqdm.notebook was used because we are rendering the progress in a Jupyter Notebook. For normal use, you can import directly from tqdm.

As you can see, the BFS algorithm managed to find the route from source to destination in under a second, having transversed 91% of the graph to find it. Let’s plot the route.

fig, ax = osmnx.plot_graph_route(G, route)
../_images/GraphSearchAlgorithms_12_0.png

Let’s also plot it on a map.

from smart_mobility_utilities.viz import draw_route
draw_route(G, route)

Depth-first Search (DFS)

The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves exhaustive searches of all the nodes by going as deep as possible into the graph. When it reaches the last layer with no result, it “backtracks” up a layer and continues the search.

DEPTH-FIRST-SEARCH(source,destination) return a route
frontiera LIFO initialized with source node
exploredempty
foundFalse

while frontier is not empty and found is False do
nodefrontier.pop()
add node to explored
for child in node.expand() do
if child is not in explored and child is not in frontier then
if child is destination then
routechild.route()
foundTrue
add child to frontier
return route

As you may have the noticed, the only difference between DFS and BFS is in the way that frontier works. Rather than working down layer by layer (FIFO), DFS drills down to the bottom-most layer and moves its way back to the starting node (LIFO).

Let’s implement this algorithm with our previous example.

time_complexity = len(G.nodes) + len(G.edges) # Maximum is V + E
bar = tqdm(total=time_complexity) 

# Process run time for later analysis
start_dfs = process_time()

route = []
frontier = deque([origin])
explored = set()
found = False



while frontier and not found:
    node = frontier.pop()
    explored.add(node)
    for child in node.expand():
        if child not in explored and child not in frontier:
            if child == destination:
                route  = child.path()
                found = True
                continue
            frontier.append(child)
        bar.update(1)

end_dfs = process_time() - start_dfs

bar.close()
print("Route: \n",route,"\n\n Cost:\n",cost(G,route))
Route: 
 [1907446268, 55808205, 55808194, 55808408, 55808414, 8711144452, 55808328, 55808437, 3210497979, 389677988, 1686556839, 389677984, 50885180, 36607322, 389677990, 389677993, 390545921, 60654129, 60654120, 50897854, 50897859, 389678001, 389678002, 2143434369, 390550470, 389678003, 390548860, 389678004, 771950946, 984911356, 728157228, 306721042, 389678005, 2143487625, 389678007, 5277943137, 2498969982, 389677902, 390545068, 390545043, 306725181, 390545044, 771931704, 775377001, 771950967, 8608123052, 771931728, 8608123055, 8608123068, 3554867351, 390545045, 390545047, 390545049, 390545050, 389678203, 390545078, 390545077, 24959544, 389678013, 389678205, 389678206, 389678207, 3996667046, 3996667045, 389678209, 389678210, 389678054, 389678175, 1432347915, 389678044, 389678043, 215726254, 3983181527, 389678211, 389678177, 24959555, 389678042, 389678184, 389678183, 389678216, 389678215, 389678214, 24959557, 389678218, 389678219, 773004741, 773004737, 5567060881, 5567060879, 1005007860, 1005007861, 24959565, 24959569, 389678241, 7311150229, 7311150221, 7311150222, 7311150218, 7311142107, 7311142109, 24959589, 390545033, 152659384, 389678238, 389678239, 389678240, 389678225, 389678245, 389678229, 729406374, 389678246, 29604723, 2143440970, 1458386384, 391188278, 389678247, 249989991, 391188296, 249990004, 389678248, 389678249, 6123553651, 6532307387, 6532307390, 969631968, 409731632, 4579468982, 969631975, 389678250, 394502545, 394502565, 394502563, 2809034239, 4380884143, 4380884142, 4376693531, 389678136, 389677925, 389678134, 2557539827, 389678133, 389677909, 749952029, 389677908, 389678222, 7311057936, 7311057937, 6028562355, 2557542523, 389677907, 239055729, 389678039, 389678040, 389677889, 389678220, 749951161, 393676412, 7311036242, 7311057933, 2557539817, 2557539816, 389678227, 1633422235, 1633421933, 1633421938] 

 Cost:
 6092.076
draw_route(G,route)

It is very evident that the paths generated by our DFS and BFS implementations are not the most direct route. This is because both DFS and BFS are algorithms that can find routes between two nodes, but make no guarantees that they will return the shortest path. Additionally, DFS generally returns “deeper” results as it traverses the entire depth of the graph and works backwards to find a solution.


Shortest Path Algorithms


Hill Climbing

The idea of the algorithm is quite simple:

Starting with a known (non-optimized) solution to a function that is to be optimized, the algorithm checks the neighbours of that solution, and chooses the neighbour that is “more” optimized. The process is repeated until no “better” solution can be found, at which point the algorithm terminates.

While the algorithm works relatively well with convex problems, functions with multiple local maxima will often result in an answer that is not the global maximum. It also performs poorly when there are plateaus (a local set of solutions that are all similarly optimized).

HILL-CLIMBING(source,destination) return a route
current ← random route from source to destination
neighbours ← children of current

while min(neighbours) < current do
current ← min(neighbours)
neighbours ← children of current
return current


Here, we introduce a few new ideas.

First, we treat the route between two nodes as a function, the value of which is the distance between the two nodes. Second, we generate “children” of this function.

The function

We need to define a function \(f\) that is our target for optimization.

\(f(x)\) gives us the length of a route for some given route \(x \in Y\), where \(Y\) is the set of all possible routes between two specific nodes.

How do we generate \(x\)? We could just generate random permutations between the two nodes, filtering for permutations that are feasible, and optimize \(f\) over these random, sparse permutations.

However, this method is not reproducible (because the permutations change every run).

Instead, we make a deterministic policy that generates a number of \(x \in Y\) by successively “failing” nodes between the source and destination nodes. We then find the shortest path between the nodes before and after the “failed” nodes.

By failing the nodes in a deterministic fashion, we can say that we have a function and neighbourhood with defined size for a certain value so we can “rigorously” conduct a local search.

To generate our initial route and children routes, we will use the smart_mobility_utilities package. You can see how these routes are generated by consulting the documentation for that package.

Note

The implementation below uses a function called get_children which is located in the smart_mobility_utilities library. It offers both normal and multiprocessed versions in order to speed up processing time.

We will be using the non-multiprocessed version for this example. To see an implementation of the multiprocessed version, see the analysis at the end of this section.

from smart_mobility_utilities.common import randomized_search
from smart_mobility_utilities.children import get_children
import matplotlib.pyplot as plt

# Visualize the costs over time
costs = []
# Set the number of children to generate
num_children = 20

# Process run time for later analysis
start_hill = process_time()

current = randomized_search(G, origin.osmid, destination.osmid)
costs.append(cost(G,current))
print("Initial cost:",costs[0])

neighbours = get_children(G,current,num_children=num_children)
shortest = min(neighbours , key = lambda route : cost(G, route))

print("Initial min(children):",cost(G,shortest))
while cost(G, shortest) < cost(G, current):
    current = shortest
    neighbours = get_children(G,current,num_children=num_children)
    shortest = min(neighbours , key = lambda route : cost(G, route))
    costs.append(cost(G,current))
    print(f"Current cost:",costs[-1],"|","min(children):",cost(G, shortest))

route = current

end_hill = process_time() - start_hill

plt.xticks(range(len(costs)))
plt.title("Hill climbing: Shortest Path")
p = plt.plot(costs)
print("Final cost:",costs[-1])
Initial cost: 1315.842
Initial min(children): 1045.913
Current cost: 1045.913 | min(children): 1040.293
Current cost: 1040.293 | min(children): 1045.913
Final cost: 1040.293
../_images/GraphSearchAlgorithms_33_4.png
draw_route(G,route)

While the implementation above is deterministic in nature, the initial route is still randomized. That means that it’s possible to get different results across runs.

Hill climbing will generally return some decent results as there are few local optimal points in the route function. However, with larger search spaces that will naturally have more local maxima and plateaus, it will get stuck fairly quickly.





Hierarchical Approaches

When facing routing problems at larger scales, such as those involving entire continents or graphs with millions of nodes, it is simply implausible to use basic approaches like Dijkstra or DFS. Instead, routing algorithms “prune” the search space in order to simplify the routing problem. Additionally, routing services may choose to precompute certain routes and cache them on servers, so that response times to user queries are reasonable.

Hierarchical search algorithms prune the search space by generating admissible heuristics that abstract the search space. Read more about the general approach of hierarchical methods at Faster Optimal and Suboptimal Hierarchical Search.

In this section, we’ll give a brief overview of two hierarchical approaches that aim to solve the shortest path problem, and show how their heuristics are computed. There will also be a python implementation of the Contraction Hierarchies example.

Highway Hierarchies

In this algorithm, the hierarchy “level” of each road/arc in the graph is calculated. This distinguishes the type of road segment (i.e. residential, national roads, highways). This is further supplemented by relevant data such as maximum designated driving speed, as well as number of turns in the road. After the heuristics are generated for the graph, the data is passed through a modified search function (bi-directional Dijkstra, A*, etc) that considers the distance to the destination and the potential expansion node class.

For example, the algorithm will generally consider highways as viable expansion nodes when it is still relatively further away from the target, and then will start to include national roads, and finally residential streets as it nears the destination.

While this approach “makes sense”, there are some disadvantages. First, the algorithm largely overlooks what kind of roads humans “prefer” to drive on. That is to say, which a highway might make sense for a given route, the user may “prefer” to take local roads (i.e. driving to a friend’s house who lives nearby). Secondly, highway hierarchies do not take into account factors such as traffic, which fluctuates often and adds significant cost to an “optimal” route.

You can learn more about contraction hierarchies here.

Contraction Hierarchies

While the highway hierarchies algorithm may be useful for speeding up shortest path searches, it only considers three levels of road hierarchies. On the other hand, the contraction hierarchies algorithm introduced by Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks has the same number of hierarchies as nodes in the graph, which is beneficial as increased number of hierarchies results in more pruning of the search space.

Contraction

Taking any node \(V\) from a graph \(G\), remove \(V\) as well as any connected edges.
Add any number of edges to \(G\) such that the shortest distance for any pair of neighbouring nodes remains the same, even after \(V\) has been removed. Below is an example:

Suppose we are contracting Node 1. By removing the node, we have now changed the shortest path for 0→2, 0→3, and 2→3. We can restore it by creating new edges. After the creation of new edges, we then reinsert any removed nodes and edges. We then update that node with its hierarchy level (which is just the order of contraction, from \(1 to n\)). See the below updated graph:

This process is then repeated for all the other nodes. In our example, no other nodes need to be contracted, as removing any one of the remaining nodes does not affect the rest of the graph in terms of shortest path. This process will form shortcuts, which allow us to search the graph much faster, as we can ignore certain nodes that have been “pruned”.

Contraction Order

Any order for node contraction will result in a successful algorithm, however some contraction ordering systems minimizes the number of new edges added to the graph, and thus the overall running time.

To utilize this, we can employ the idea of edge difference (ED). The ED of a node is \(S-E\), where \(S\) is the number of new edges added if that node were to be contracted, and \(E\) is the number of edges that would be deleted. Minimizing ED is equivalent to minimizing the number of new edges added to the graph, and thus improves processing time. We can calculate the ED of a graph with the below function:

def calculate_edge_difference(G, shortest_paths):
    edge_difference = list()
    seenBefore = list()
    
    for i in G.nodes():
        # used in edge difference calculations
        edges_incident = len(G[i])

        # we will be deleting the node entry
        # from the original shortest paths
        # dictionary so we need to save its state
        # for later iterations
        contracted_node_paths = shortest_paths[i]
        del shortest_paths[i]
        
        # excluding the node that we have just contracted
        new_graph = [*G.nodes()] 
        new_graph.remove(i)
        
        # let's compute the new shortest paths between
        # the nodes of the graph without the contracted
        # node so we can see the changes and add arcs 
        # to the graph accordingly but that is in
        # the algorithm itself 
        new_shortest_paths = dict()

        for source in new_graph:
            new_shortest_paths[source] = dict()
            for destination in new_graph:
                # path the contracted node "i" to compute new shortest paths accordingly
                new_shortest_paths[source][destination] = dijkstra_with_contraction(G, \
                                                                                    source, \
                                                                                    destination, \
                                                                                    contracted = i)
        # the add arcs to keep the graph all pairs shortest paths invariant
        shortcuts = 0

        for source in new_shortest_paths:
            # we get a copy from the original and the new shortest paths dictionary
            SP_contracted = new_shortest_paths[source]
            SP_original = shortest_paths[source]
            for destination in SP_contracted:
                # this is statement so we don't add 2 arcs
                # for the same pair of nodes 
                if [source, destination] in seenBefore: continue
                seenBefore.append(sorted((source,destination)))
                
                # if there is a difference between the original SP and
                # post-contraction SP -- just add new arc
                if SP_contracted[destination] != SP_original[destination]:
                    shortcuts += 1
        
        # let's leave the dictionary as we took it 
        # from the last iteration
        shortest_paths[i] = contracted_node_paths
        
        # this is the value of the contraction
        # heuristic for that node
        ED = shortcuts - edges_incident
        edge_difference.append((i, ED))
    return edge_difference

Example #1: Bi-directional Dijkstra with contraction

After contracting a graph, we can run a bi-directional Dijkstra to compute all the shortest paths in the graph. Following this, any queries for the shortest path between any two nodes in the graph can be solved using a cached result.

There are some restrictions for our modified Dijkstra’s:

  1. The forward expansion only considers arcs \(u,v\), where \(level(u) > level(v)\). This forms the upward graph, where only nodes with a higher contraction order can be relaxed.

  2. The backward expansion only considers arcs \(u,v\), where \(level(u) < level(v)\). This forms the downward graph, where only nodes with a lower contraction order can be relaxed.

These restrictions help “prune” the search space and speed up the processing.

At the end, multiple paths are returned from source to destination, of which the shortest is selected to be the solution.

Let’s try this with the sample graph below:

import networkx
import random
import math

G = networkx.Graph()

# Computing each contraction level requires a new graph, which is costly in terms of memory.
# To avoid this, we use a flag on every node to show its contraction state.

# Add 14 nodes
for i in range(1,15):
    G.add_node(i, contracted=False)

# Define the edges
edges = [
    (1,2,{'weight':1}),
    (1,3,{'weight':4}),
    (2,3,{'weight':5}),
    (2,4,{'weight':2}),
    (3,4,{'weight':2}),
    (3,7,{'weight':2}),
    (3,8,{'weight':1}),
    (3,9,{'weight':1}),
    (4,5,{'weight':5}),
    (5,10,{'weight':7}),
    (6,7,{'weight':4}),
    (6,8,{'weight':3}),
    (6,10,{'weight':3}),
    (6,5,{'weight':3}),
    (6,9,{'weight':1}),
    (7,8,{'weight':6}),
    (8,9,{'weight':3}),
    (8,13,{'weight':5}),
    (9,12,{'weight':1}),
    (9,10,{'weight':3}),
    (10,11,{'weight':4}),
    (11,12,{'weight':3}),
    (11,13,{'weight':4}),
    (12,13,{'weight':2}),
    (14,1,{'weight':3}),
    (14,13,{'weight':2})
]
# Add the edges to the graph and visualize it.
G.add_edges_from([*edges])
networkx.draw(G,with_labels=True)
../_images/GraphSearchAlgorithms_59_0.png

Below is a modified Dijkstra, with the condition that nodes are only considered if contracted=False. Note that all flags are set to False at the end of the algorithm, as graphs are passed by reference.

def dijkstra_with_contraction(G, source, destination, contracted = None):
    networkx.set_node_attributes(G, {contracted: True}, 'contracted')
        
    shortest_path = dict()
    heap = list()
    
    for i in G.nodes():
        if not networkx.get_node_attributes(G, 'contracted')[i]:
            shortest_path[i] = math.inf
            heap.append(i)
    shortest_path[source] = 0
    
    while len(heap) > 0:
        
        q = min(heap, key = lambda node : shortest_path[node])
        if q == destination:
            networkx.set_node_attributes(G, {contracted: False}, 'contracted')
            return shortest_path[q]
        heap.remove(q)
        
        for v in G[q]:
            # if the node is contracted we skip it
            if not networkx.get_node_attributes(G, 'contracted')[v]:
                distance = shortest_path[q] + G[q][v]['weight']
                if distance < shortest_path[v]:
                    shortest_path[v] = distance
                    
    networkx.set_node_attributes(G, {contracted: False}, 'contracted')
    
    return math.inf # if we can't reach the destination

With this complete, the shortest path for every node pair in the graph can now be computed. This will be done without any contraction. Then, the ED heuristic can be calculated for the entire graph.

shortest_paths = dict()
for i in G.nodes():
    # dictionary for every node
    shortest_paths[i] = dict()
    for j in G.nodes():
        # that will be filled with the shortest path
        # between it and other nodes
        shortest_paths[i][j] = dijkstra_with_contraction(G, i, j)

edge_difference = calculate_edge_difference(G,shortest_paths)
edge_difference.sort(key=lambda pair: pair[1])

edge_difference
[(8, -5),
 (10, -4),
 (5, -3),
 (7, -3),
 (11, -3),
 (14, 0),
 (2, 2),
 (1, 3),
 (6, 4),
 (13, 5),
 (4, 7),
 (12, 18),
 (3, 26),
 (9, 33)]

While this is a good start to optimizing the node contraction order, it is by no means perfect. Notice that the ED values calculated above assume the node is the only node removed from the graph. Because we are successively removing every node in the graph, the ED list will potentially become inaccurate after even the first contraction.

Recall however that any arbitrary contraction order results in a successful algorithm. While there are ED heuristics that are able to update the ED list after each contraction, each of them come with their own costs and benefits.

For the purposes of this example, our ED list will be enough.

# to keep track of the edges added after the algorithm finishes
edges_before = [*G.edges()]

current_graph = [*G.nodes()]

# iterating over the tuples (node, level)
# from the sorted edge difference list

for node_ED in edge_difference:
    node = node_ED[0]
    
    # now we will contract the given node through all iterations
    networkx.set_node_attributes(G, {node: True}, 'contracted')
    
    # we have already contracted the node
    # so there is no need 
    new_graph = current_graph
    new_graph.remove(node)
    current_shortest_paths = dict()
    for source in new_graph:
            current_shortest_paths[source] = dict()
            for destination in new_graph:
                current_shortest_paths[source][destination] = dijkstra_with_contraction(G, \
                                                                                    source, \
                                                                                    destination)
                
    for source in current_shortest_paths:
        SP_contracted = current_shortest_paths[source]
        SP_original = shortest_paths[source]
        for destination in SP_contracted:
            if source == destination: continue
            if SP_contracted[destination] != SP_original[destination]:
                print("we have added edge between ", source, destination," after contracting", node)
                
                # it seems like we add two edges instead of one, but this is simple graph
                # so adding edge from a to b and then adding edge from b to a are the same
                # doesn't add a thing, we didn't include condition for that because it will
                # complicate the algorithm
                
                G.add_edge(source, destination, weight=SP_original[destination])
                
    
    current_graph = new_graph
we have added edge between  1 13  after contracting 14
we have added edge between  2 13  after contracting 14
we have added edge between  13 1  after contracting 14
we have added edge between  13 2  after contracting 14
we have added edge between  1 4  after contracting 2
we have added edge between  4 1  after contracting 2
# new edges after adding additional arcs
edges_after = [*G.edges()]

print("# edges before", len(edges_before))
print("# edges after", len(edges_after))
# edges before 26
# edges after 29

We can visualize these “shortcuts” on the graph.

added_edges = list(set(edges_after) - set(edges_before))

# let's color these edges and draw the graph again
colors = ['r' if edge in added_edges else 'k' for edge in G.edges()] 
networkx.draw(G, with_labels=True, edge_color=colors)
../_images/GraphSearchAlgorithms_68_0.png

We can reformat the ED list to become a hierarchy, for the purposes of our bi-directional Dijkstra:

hierarchical_order = dict()
for order, node in enumerate(edge_difference):
    hierarchical_order[node[0]] = order
    
hierarchical_order
{8: 0,
 10: 1,
 5: 2,
 7: 3,
 11: 4,
 14: 5,
 2: 6,
 1: 7,
 6: 8,
 13: 9,
 4: 10,
 12: 11,
 3: 12,
 9: 13}

Let’s now find the shortest path from \(8\) to \(12\).

source = 8
destination = 12
# Generating the upward graph

# initializing 
SP_s = dict()
parent_s = dict()
unrelaxed_s = list()
for node in G.nodes():
    SP_s[node] = math.inf
    parent_s[node] = None
    unrelaxed_s.append(node)
SP_s[source] = 0

# dijkstra
while unrelaxed_s:
    node = min(unrelaxed_s, key = lambda node : SP_s[node])
    unrelaxed_s.remove(node)
    if SP_s[node] == math.inf: break
    for child in G[node]:
        # skip unqualified edges
        if hierarchical_order[child] < hierarchical_order[node]: continue
        distance = SP_s[node] + G[node][child]['weight']
        if distance < SP_s[child]:
            SP_s[child] = distance
            parent_s[child] = node
# Generating the downward graph

# initializing 
SP_t = dict()
parent_t = dict()
unrelaxed_t = list()
for node in G.nodes():
    SP_t[node] = math.inf
    parent_t[node] = None
    unrelaxed_t.append(node)
SP_t[destination] = 0

# dijkstra
while unrelaxed_t:
    node = min(unrelaxed_t, key = lambda node : SP_t[node])
    unrelaxed_t.remove(node)
    if SP_t[node] == math.inf: break
    for child in G[node]:
        # skip unqualified edges
        if hierarchical_order[child] < hierarchical_order[node]: continue
        distance = SP_t[node] + G[node][child]['weight']
        if distance < SP_t[child]:
            SP_t[child] = distance
            parent_t[child] = node

With these, we now merge the common settled nodes from SP_d and SP_s, and find the minimum sum of values, This is our shortest path: the collission between the forward and backward expansions.

minimum = math.inf
merge_node = None
for i in SP_s:
    if SP_t[i] == math.inf: continue
    if SP_t[i] + SP_s[i] < minimum:
        minimum = SP_t[i] + SP_s[i]
        merge_node = i

print("Minimum:",minimum)
print("Merge node:",merge_node)
Minimum: 3
Merge node: 9
# Trace the route back
def route_dijkstra(parent, node):
    route = []
    while node != None:
        route.append(node)
        node = parent[node]
    return route[::-1]

route_from_target = route_dijkstra(parent_t, merge_node)
route_from_source = route_dijkstra(parent_s, merge_node)
route = route_from_source + route_from_target[::-1][1:]
route
[8, 3, 9, 12]

Let’s compare this to networkx’s built in solver:

from networkx.algorithms.shortest_paths.weighted import single_source_dijkstra
single_source_dijkstra(G,source, destination)
(3, [8, 3, 9, 12])

Pruning

We can actually check to see how many nodes were “pruned” from our graph, using the upward and downward graphs generated earlier.

unvisited = 0
for s_node, s_dist in SP_s.items():
    for t_node, t_dist in SP_t.items():
        if s_node == t_node and s_dist == t_dist == math.inf:
            unvisited += 1
print(f"""Skipped {unvisited} nodes from a graph with {len(G)} total nodes,
resulting in pruning {unvisited/len(G)*100}% of the nodes in our search space.""")
Skipped 7 nodes from a graph with 14 total nodes,
resulting in pruning 50.0% of the nodes in our search space.

Summary

Let’s use the processing run times we collected earlier to show a comparison of the processing times for the different algorithms discussed here.

print("BFS:",end_bfs,"s")
print("DFS:",end_dfs,"s")
print("Djikstra:",end_dijkstra,"s")
print("Hill Climb:",end_hill,"s")
print("Beam Search:",end_beam,"s")
print("A* Search:",end_astar,"s")
print("Bi-directional A*:",end_bidirectional,"s")
BFS: 0.013275471000000039 s
DFS: 0.02681750300000374 s
Djikstra: 0.1345126840000006 s
Hill Climb: 15.583480621 s
Beam Search: 135.696005033 s
A* Search: 0.007053232000004073 s
Bi-directional A*: 0.013135845000022073 s